A Neural Framework for Web Ranking Using Combination of Content and Context Features
Authors
Abstract
Containing enormous amounts of data of many types, the web has become the main source for finding information. However, retrieving the desired information from such a vast, heterogeneous environment is very difficult, which has led to a drastic increase in the popularity of internet search engines. Designing ranking strategies that are both efficient and effective, as the core of web information retrieval systems, is therefore unavoidable. Unfortunately, most proposed ranking algorithms do not work well over general datasets because of their fixed configurations, and many also suffer from high computational costs. To address these shortcomings, this paper proposes a new ranking framework named "NNRank", which feeds primitive content and context features of web documents to an artificial neural network. The network used in our approach is either a radial basis function or a principal component analysis neural network; owing to their high convergence rates, these networks can achieve high performance with a limited number of features. Experimental results on the TREC 2004 data collected in the Microsoft LETOR dataset indicate a noticeable improvement over well-known ranking algorithms such as TF-IDF, PageRank, and HITS. The results are also comparable with those of BM25.
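The abstract gives no implementation details, so the following is only a minimal sketch of how a radial basis function network could score documents from one content feature and one context feature. It is not the authors' NNRank. The feature values, the hand-picked centers, the `GAMMA` width, and the simple least-mean-squares training rule are all assumptions made for illustration.

```python
import math

# Hypothetical toy data: each document is (content_score, context_score),
# e.g. a normalized text-match score and a normalized link-based score.
TRAIN_X = [(0.9, 0.8), (0.8, 0.9), (0.1, 0.2), (0.2, 0.1)]
TRAIN_Y = [1.0, 1.0, 0.0, 0.0]  # 1 = relevant, 0 = not relevant

# Assumed RBF centers (one "relevant" and one "irrelevant" prototype)
# and an assumed Gaussian width parameter.
CENTERS = [(0.85, 0.85), (0.15, 0.15)]
GAMMA = 2.0


def rbf_activations(x, centers, gamma):
    """Gaussian activation of feature vector x with respect to each center."""
    return [
        math.exp(-gamma * sum((xi - ci) ** 2 for xi, ci in zip(x, c)))
        for c in centers
    ]


def score(x, centers, gamma, weights, bias):
    """Linear output layer applied to the hidden RBF activations."""
    phi = rbf_activations(x, centers, gamma)
    return bias + sum(w * p for w, p in zip(weights, phi))


def train(samples, labels, centers, gamma, lr=0.1, epochs=500):
    """Fit only the output weights with per-sample gradient steps (LMS rule)."""
    weights = [0.0] * len(centers)
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            phi = rbf_activations(x, centers, gamma)
            err = y - (bias + sum(w * p for w, p in zip(weights, phi)))
            weights = [w + lr * err * p for w, p in zip(weights, phi)]
            bias += lr * err
    return weights, bias


def rank(docs, centers, gamma, weights, bias):
    """Return document ids sorted by descending predicted relevance."""
    return sorted(
        docs,
        key=lambda d: score(docs[d], centers, gamma, weights, bias),
        reverse=True,
    )


if __name__ == "__main__":
    w, b = train(TRAIN_X, TRAIN_Y, CENTERS, GAMMA)
    candidates = {"doc_a": (0.85, 0.7), "doc_b": (0.2, 0.3), "doc_c": (0.5, 0.5)}
    print(rank(candidates, CENTERS, GAMMA, w, b))
```

Because the centers are fixed, only the linear output layer is trained, which is one reason RBF networks can converge quickly. A real system would choose centers from data (for example by clustering) and use the full LETOR feature set rather than two hand-made features.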
Similar resources
Situational Context for Ranking in Personal Search
Modern search engines leverage a variety of sources, beyond the conventional query-document content similarity, to improve their ranking performance. Among them, query context has attracted attention in prior work. Previously, query context was mainly modeled by user search history, either long-term or short-term, to help the ranking of future queries. In this paper, we focus on situational con...
Effective Learning to Rank Persian Web Content
Persian is one of the most widely used languages on the Web. Hence, the Persian Web includes invaluable information that needs to be retrieved effectively. As with other languages, ranking algorithms for Persian Web content face different challenges, such as applicability issues in real-world situations as well as the lack of user modeling. CF-Rank, as a ...
Discrimination of Power Quality Distorted Signals Based on Time-frequency Analysis and Probabilistic Neural Network
Recognition and classification of Power Quality Distorted Signals (PQDSs) in power systems is an essential duty. One of the noteworthy issues in Power Quality Analysis (PQA) is identification of distorted signals using an efficient scheme. This paper recommends a Time–Frequency Analysis (TFA), for extracting features, so-called "hybrid approach", using incorporation of Multi Resolution Analysis...
Combination of Documents Features Based on Simulated Click-through Data
Many different ranking algorithms based on content and context have been used in web search engines to find pages based on a user query. Furthermore, to achieve better performance some new solutions combine different algorithms. In this paper we use simulated click-through data to learn how to combine many content and context features of web pages. This method is simple and practical to use wit...
RRLUFF: Ranking function based on Reinforcement Learning using User Feedback and Web Document Features
The principal aim of a search engine is to provide sorted results according to the user's requirements. To achieve this aim, it employs ranking methods to rank web documents based on their significance and relevance to the user query. The novelty of this paper is a user feedback-based ranking algorithm using reinforcement learning. The proposed algorithm is called RRLUFF, in which the rank...